Virtual Talking Heads and audiovisual articulatory synthesis
Abstract
Our approach to audiovisual articulatory synthesis involves developing Virtual Talking Heads that integrate the articulatory, aerodynamic and acoustic phenomena underlying speech production. Specifically, these Talking Heads are faithful clones of the speakers whose data the various models are based on. Our contribution presents some of the results achieved at ICP in this domain: 3D orofacial linear articulatory models, made possible by recent progress in medical imaging and video processing techniques; aerodynamic and acoustic models, together with basic principles of glottis / oral constriction coordination; various strategies for determining the articulatory control parameters (coarticulation models vs. simple concatenative strategies); and text-to-audiovisual speech synthesis. Finally, we make some suggestions for future developments.
Similar resources
Creating and controlling video-realistic talking heads
We present a linear three-dimensional modeling paradigm for lips and face that captures the audiovisual speech activity of a given speaker with only six parameters. Our articulatory models are constructed from real data (front and profile images), using a linear component analysis of about 200 3D coordinates of fleshpoints on the subject's face and lips. Compared to a raw component analysis, our...
Towards an Audiovisual Virtual Talking Head: 3D Articulatory Modeling of Tongue, Lips and Face Based on MRI and Video Images
A linear three-dimensional articulatory model of tongue, lips and face is presented. The model is based on a linear component analysis of the 3D coordinates defining the geometry of the different organs, obtained from Magnetic Resonance Imaging of the tongue, and from front and profile video images of the subject’s face marked with small beads. In addition to a common jaw height parameter, the ...
A comparison of German talking heads in a smart home environment
The authors describe a newly developed German Text-To-Audiovisual-Speech (TTavS) synthesis system based on the English-speaking HeadZero. Targets for the control parameters of the talking head are generated by mapping German phonemes to the originally English visemic blend-shape controls. The resulting German version of HeadZero and the German talking head MASSY were extended to generate audi...
Artimate: an articulatory animation framework for audiovisual speech synthesis
We present a modular framework for articulatory animation synthesis using speech motion capture data obtained with electromagnetic articulography (EMA). Adapting a skeletal animation approach, the articulatory motion data is applied to a three-dimensional (3D) model of the vocal tract, creating a portable resource that can be integrated in an audiovisual (AV) speech synthesis platform to provide...
Phoneme-level articulatory animation in pronunciation training
Speech visualization is extended to use animated talking heads for computer assisted pronunciation training. In this paper, we design a data-driven 3D talking head system for articulatory animations with synthesized articulator dynamics at the phoneme level. A database of AG500 EMA-recordings of three-dimensional articulatory movements is proposed to explore the distinctions of producing the so...